Efficient Sampling of Parsimonious Inversion Histories with Application to Genome Rearrangement in Yersinia
Authors
Abstract
Inversions are among the most common mutations acting on the order and orientation of genes in a genome, and polynomial-time algorithms exist to obtain a minimum-length series of inversions that transforms one genome arrangement into another. However, the minimum-length series of inversions (the optimal sorting path) is often not unique: many such optimal sorting paths can exist. If we assume that all optimal sorting paths are equally likely, then statistical inference on genome arrangement history must account for all such sorting paths and not just a single estimate. No deterministic polynomial-time algorithm is known either to count the optimal sorting paths or to sample from the uniform distribution over them. Here, we propose a stochastic method that uniformly samples the set of all optimal sorting paths. Our method uses a novel formulation of parallel Markov chain Monte Carlo. In practice, our method can quickly estimate the total number of optimal sorting paths. We introduce a variant of our approach in which short inversions are modeled to be more likely, and we show how the method can be used to estimate the distribution of inversion lengths and breakpoint usage in pathogenic Yersinia pestis. The proposed method has been implemented in a program called "MC4Inversion." We compare MC4Inversion with the sampler implemented in BADGER and with a previously described importance sampling (IS) technique. We find that on high-divergence data sets, MC4Inversion finds more optimal sorting paths per second than BADGER and the IS technique while avoiding the bias inherent in the IS technique.
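To make the non-uniqueness of optimal sorting paths concrete, here is a minimal brute-force sketch in Python. It is not the parallel MCMC method of the paper and not how MC4Inversion works; it simply enumerates, by breadth-first search, every shortest series of signed inversions that sorts a very small signed permutation. The function names `apply_inversion` and `count_optimal_sorting_paths` are hypothetical, chosen only for illustration, and the approach is feasible only for a handful of genes.

```python
from collections import deque


def apply_inversion(perm, i, j):
    """Reverse the segment perm[i..j] (inclusive) and flip every sign in it."""
    return perm[:i] + tuple(-x for x in reversed(perm[i:j + 1])) + perm[j + 1:]


def count_optimal_sorting_paths(start):
    """Return (inversion distance, number of optimal sorting paths).

    Illustrative brute force (an assumption of this sketch, not the paper's
    algorithm): breadth-first search over signed permutations, counting the
    shortest inversion series leading to the identity (1, 2, ..., n).
    """
    start = tuple(start)
    target = tuple(range(1, len(start) + 1))
    dist = {start: 0}    # shortest number of inversions from start
    paths = {start: 1}   # number of shortest inversion series from start
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            break  # every predecessor of the target has already been expanded
        n = len(cur)
        for i in range(n):
            for j in range(i, n):
                nxt = apply_inversion(cur, i, j)
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    paths[nxt] = paths[cur]
                    queue.append(nxt)
                elif dist[nxt] == dist[cur] + 1:
                    # another shortest route into an already-seen arrangement
                    paths[nxt] += paths[cur]
    return dist[target], paths[target]


if __name__ == "__main__":
    print(count_optimal_sorting_paths((2, 1)))
```

For the two-gene arrangement `(2, 1)` this prints `(3, 6)`: three inversions are needed, and six distinct optimal sorting paths achieve that minimum. The number of arrangements grows as 2^n · n!, which is why exhaustive enumeration stops being practical almost immediately and sampling approaches become attractive.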
Related resources
Dynamics of Genome Rearrangement in Bacterial Populations
Genome structure variation has profound impacts on phenotype in organisms ranging from microbes to humans, yet little is known about how natural selection acts on genome arrangement. Pathogenic bacteria such as Yersinia pestis, which causes bubonic and pneumonic plague, often exhibit a high degree of genomic rearrangement. The recent availability of several Yersinia genomes offers an unpreceden...
On sampling SCJ rearrangement scenarios
The Single Cut or Join (SCJ) operation on genomes, generalizing chromosome evolution by fusions and fissions, is the computationally simplest known model of genome rearrangement. While most genome rearrangement problems are already hard when comparing three genomes, it is possible to compute in polynomial time a most parsimonious SCJ scenario for an arbitrary number of genomes related by a bina...
Genome rearrangements and phylogeny reconstruction in Yersinia pestis
Genome rearrangements have played an important role in the evolution of Yersinia pestis from its progenitor Yersinia pseudotuberculosis. Traditional phylogenetic trees for Y. pestis based on sequence comparison have short internal branches and low bootstrap supports as only a small number of nucleotide substitutions have occurred. On the other hand, even a small number of genome rearrangements ...
Genome sequence of Yersinia pestis KIM.
We present the complete genome sequence of Yersinia pestis KIM, the etiologic agent of bubonic and pneumonic plague. The strain KIM, biovar Mediaevalis, is associated with the second pandemic, including the Black Death. The 4.6-Mb genome encodes 4,198 open reading frames (ORFs). The origin, terminus, and most genes encoding DNA replication proteins are similar to those of Escherichia coli K-12....
(1+ε)-approximation of Sorting by Reversals and Transpositions
This paper is concerned with the problem of sorting permutations using long range operations like inversions (reversing a segment) and transpositions (moving a segment). The problem comes from computational molecular biology, where the aim is to find a parsimonious rearrangement scenario that explains the difference in gene order between two genomes. In the late eighties, Palmer and Herbon [15]...
Journal title:
Volume 1, Issue
Pages -
Publication date: 2009